Statistical Inference for Topological Data Analysis PhD Thesis Proposal

Authors

  • Fabrizio Lecci
  • Jessi Cisewski
  • Frederic Chazal
  • Alessandro Rinaldo
  • Ryan Tibshirani
  • Larry Wasserman
Abstract

Topological Data Analysis (TDA) is an emerging area of research at the intersection of algebraic topology and computational geometry, aimed at describing, summarizing, and analyzing possibly high-dimensional data using low-dimensional algebraic representations. Recent advances in computational topology have made it possible to compute topological invariants from data. These novel data summaries have been used successfully in a variety of applied problems, and their potential for high-dimensional statistical inference appears significant. Nonetheless, the statistical properties of the data summaries produced in TDA and, more generally, of the usually heuristic data-analytic methods they are part of, have remained largely unexplored by statisticians. Our analysis uses the tools of persistent homology, the main method of TDA for measuring the topological features of shapes and functions at different resolutions. A major part of our research also focuses on cluster trees and Reeb graphs, which provide a simple yet meaningful abstraction of the input domain of a function by means of the topological changes in its level sets. The main goal of this thesis is to contribute to the development of a statistical theory for TDA and to propose new, statistically principled methodologies that improve and extend the applicability of TDA algorithms. In particular, we will (1) study tests of significance and confidence intervals to separate topological signal from topological noise; (2) explore new methods for topological dimension reduction; and (3) determine how our methods can help reduce computational costs, which currently represent an obstacle in TDA.

Similar resources

Monte Carlo Semantics: Robust Inference and Logical Pattern Processing Based on Integrated Deep and Shallow Semantic Representations

This document was submitted to the University of Cambridge Computer Laboratory as part of the documentation required by first year PhD candidates comprising a thesis proposal (Bergmair, 2007a) and a first year report (this document). In addition, a thesis draft (Bergmair, 2007b) has been submitted to supplement the required material. – For readers other than the examiners of this PhD project, i...

Topological Data Analysis of Clostridioides difficile Infection and Fecal Microbiota Transplantation

Computational topologists recently developed a method called persistent homology to analyze data presented in terms of similarity or dissimilarity. Indeed, persistent homology studies the evolution of topological features in terms of a single index, and is able to capture higher order features beyond the usual clustering techniques. There are three descriptive statistics of persistent homology...

Exploring the Expectations of Faculty Members at Iran University of Medical Sciences from PhD Students in the Dissertation Process

Background: Knowing the expectations of supervisors may affect the quality of graduate students' theses. The aim of this study was to explore the expectations supervisors have of Ph.D. students in the process of completing a Ph.D. thesis, using a qualitative content analysis design (conventional method). Methods: This qualitative study was conducted on 25 supervisors of Iran University of Medical Science...

Exploring PhD Students' Expectations of Their Supervisors in Conducting the Doctoral Dissertation: A Qualitative Content Analysis

Introduction: Quality of research in PhD programs increases if supervisors become aware of students' expectations from them. This qualitative study aimed to explore the expectations of PhD students from their supervisors. Methods: This qualitative content analysis study was conducted on 22 graduated PhD students of Iran University of Medical Sciences, in 2014. The samples were purposef...

ILLC PhD Pilot Study

This is a proposal for a thesis in the interface of descriptive set theory and computable analysis, in a continuation of my MoL thesis [9]. There, I was interested in studying connections between game characterizations of classes of functions in descriptive set theory and the theory of Weihrauch reducibility, in particular searching for counterparts in computable analysis to the game characteri...

Publication date: 2014